

<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<head>
  <meta charset="utf-8" />
  
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  
  <title>Host Maintenance &mdash; Ceph Documentation</title>
  

  
  <link rel="stylesheet" href="../../../_static/ceph.css" type="text/css" />
  <link rel="stylesheet" href="../../../_static/pygments.css" type="text/css" />
  <link rel="stylesheet" href="../../../_static/graphviz.css" type="text/css" />
  <link rel="stylesheet" href="../../../_static/css/custom.css" type="text/css" />

  
  
    <link rel="shortcut icon" href="../../../_static/favicon.ico"/>
  

  
  

  

  
  <!--[if lt IE 9]>
    <script src="../../../_static/js/html5shiv.min.js"></script>
  <![endif]-->
  
    
      <script type="text/javascript" id="documentation_options" data-url_root="../../../" src="../../../_static/documentation_options.js"></script>
        <script src="../../../_static/jquery.js"></script>
        <script src="../../../_static/underscore.js"></script>
        <script src="../../../_static/doctools.js"></script>
    
    <script type="text/javascript" src="../../../_static/js/theme.js"></script>

    
    <link rel="index" title="Index" href="../../../genindex/" />
    <link rel="search" title="Search" href="../../../search/" />
    <link rel="next" title="Compliance Check" href="../compliance-check/" />
    <link rel="prev" title="Developing with cephadm" href="../developing-cephadm/" /> 
</head>

<body class="wy-body-for-nav">

   
  <header class="top-bar">
    

















<div role="navigation" aria-label="breadcrumbs navigation">

  <ul class="wy-breadcrumbs">
    
      <li><a href="../../../" class="icon icon-home"></a> &raquo;</li>
        
          <li><a href="../../../cephadm/">Cephadm</a> &raquo;</li>
        
          <li><a href="../">CEPHADM 开发者文档</a> &raquo;</li>
        
      <li>Host Maintenance</li>
    
    
      <li class="wy-breadcrumbs-aside">
        
          
            <a href="../../../_sources/dev/cephadm/host-maintenance.rst.txt" rel="nofollow"> View page source</a>
          
        
      </li>
    
  </ul>

  
  <hr/>
</div>
  </header>
  <div class="wy-grid-for-nav">
    
    <nav data-toggle="wy-nav-shift" class="wy-nav-side">
      <div class="wy-side-scroll">
        <div class="wy-side-nav-search"  style="background: #eee" >
          

          
            <a href="../../../">
          

          
            
            <img src="../../../_static/logo.png" class="logo" alt="Logo"/>
          
          </a>

          

          
<div role="search">
  <form id="rtd-search-form" class="wy-form" action="../../../search/" method="get">
    <input type="text" name="q" placeholder="Search docs" />
    <input type="hidden" name="check_keywords" value="yes" />
    <input type="hidden" name="area" value="default" />
  </form>
</div>

          
        </div>

        
        <div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
          
            
            
              
            
            
              <ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../../../start/intro/">Ceph 简介</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../install/">安装 Ceph</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../cephadm/">Cephadm</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../rados/">Ceph 存储集群</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../cephfs/">Ceph 文件系统</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../rbd/">Ceph 块设备</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../radosgw/">Ceph 对象网关</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../mgr/">Ceph 管理器守护进程</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../mgr/dashboard/">Ceph 仪表盘</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../api/">API 文档</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../architecture/">体系结构</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="../../developer_guide/">开发者指南</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="../../developer_guide/intro/">简介</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../developer_guide/essentials/">必备知识</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../developer_guide/merging/">何时、合并了什么</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../developer_guide/issue-tracker/">问题追踪器</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../developer_guide/basic-workflow/">基本工作流</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../developer_guide/tests-unit-tests/">测试：单元测试</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../developer_guide/testing_integration_tests/">测试：集成测试(Teuthology)</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../developer_guide/running-tests-locally/">测试：在本地运行测试</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../developer_guide/dash-devel/">Ceph Dashboard 开发者文档 (之前是 HACKING.rst)</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../developer_guide/jaegertracing/">Tracing 开发者文档</a></li>
<li class="toctree-l2 current"><a class="reference internal" href="../">Cephadm 开发者文档</a><ul class="current">
<li class="toctree-l3"><a class="reference internal" href="../developing-cephadm/">Developing with cephadm</a></li>
<li class="toctree-l3 current"><a class="current reference internal" href="#">Host Maintenance</a><ul>
<li class="toctree-l4"><a class="reference internal" href="#high-level-design">High Level Design</a></li>
<li class="toctree-l4"><a class="reference internal" href="#admin-interaction">Admin Interaction</a></li>
<li class="toctree-l4"><a class="reference internal" href="#components-impacted">Components Impacted</a></li>
<li class="toctree-l4"><a class="reference internal" href="#ideas-for-future-work">Ideas for Future Work</a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="../compliance-check/">Compliance Check</a></li>
<li class="toctree-l3"><a class="reference internal" href="../design/storage_devices_and_osds/">存储设备和 OSD 管理</a></li>
<li class="toctree-l3"><a class="reference internal" href="../scalability-notes/">Notes and Thoughts on Cephadm’s scalability</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../../internals/">Ceph 内幕</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../governance/">项目管理</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../foundation/">Ceph 基金会</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../ceph-volume/">ceph-volume</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../releases/general/">Ceph 版本（总目录）</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../releases/">Ceph 版本（索引）</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../security/">Security</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../glossary/">Ceph 术语</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../jaegertracing/">Tracing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../translation_cn/">中文版翻译资源</a></li>
</ul>

            
          
        </div>
        
      </div>
    </nav>

    <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">

      
      <nav class="wy-nav-top" aria-label="top navigation">
        
          <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
          <a href="../../../">Ceph</a>
        
      </nav>


      <div class="wy-nav-content">
        
        <div class="rst-content">
        
          <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
           <div itemprop="articleBody">
            
<div id="dev-warning" class="admonition note">
  <p class="first admonition-title">Notice</p>
  <p class="last">This document is for a development version of Ceph.</p>
</div>
  <div id="docubetter" align="right" style="padding: 5px; font-weight: bold;">
    <a href="https://pad.ceph.com/p/Report_Documentation_Bugs">Report a Documentation Bug</a>
  </div>

  
  <div class="section" id="host-maintenance">
<h1>Host Maintenance<a class="headerlink" href="#host-maintenance" title="Permalink to this headline">¶</a></h1>
<p>All hosts that support Ceph daemons need to support maintenance activity, whether the host
is physical or virtual. This means that management workflows should provide
a simple and consistent way to support this operational requirement. This document defines
the maintenance strategy that could be implemented in cephadm and mgr/cephadm.</p>
<div class="section" id="high-level-design">
<h2>High Level Design<a class="headerlink" href="#high-level-design" title="Permalink to this headline">¶</a></h2>
<p>Placing a host into maintenance, adopts the following workflow;</p>
<ol class="arabic simple">
<li><p>confirm that the removal of the host does not impact data availability (the following
steps will assume it is safe to proceed)</p>
<ul class="simple">
<li><p><code class="docutils literal notranslate"><span class="pre">orch</span> <span class="pre">host</span> <span class="pre">ok-to-stop</span> <span class="pre">&lt;host&gt;</span></code> would be used here</p></li>
</ul>
</li>
<li><p>if the host has osd daemons, apply noout to the host subtree to prevent data migration
from triggering during the planned maintenance slot.</p></li>
<li><p>Stop the ceph target (all daemons stop)</p></li>
<li><p>Disable the ceph target on that host, to prevent a reboot from automatically starting
ceph services again)</p></li>
</ol>
<p>Exiting Maintenance, is basically the reverse of the above sequence</p>
</div>
<div class="section" id="admin-interaction">
<h2>Admin Interaction<a class="headerlink" href="#admin-interaction" title="Permalink to this headline">¶</a></h2>
<p>The ceph orch command will be extended to support maintenance.</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">ceph</span> <span class="n">orch</span> <span class="n">host</span> <span class="n">enter</span><span class="o">-</span><span class="n">maintenance</span> <span class="o">&lt;</span><span class="n">host</span><span class="o">&gt;</span> <span class="p">[</span> <span class="o">--</span><span class="n">check</span> <span class="p">]</span>
<span class="n">ceph</span> <span class="n">orch</span> <span class="n">host</span> <span class="n">exit</span><span class="o">-</span><span class="n">maintenance</span> <span class="o">&lt;</span><span class="n">host</span><span class="o">&gt;</span>
</pre></div>
</div>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>In addition, the host’s status should be updated to reflect whether it
is in maintenance or not.</p>
</div>
<div class="section" id="the-check-option">
<h3>The ‘check’ Option<a class="headerlink" href="#the-check-option" title="Permalink to this headline">¶</a></h3>
<p>The orch host ok-to-stop command focuses on ceph daemons (mon, osd, mds), which
provides the first check. However, a ceph cluster also uses other types of daemons
for monitoring, management and non-native protocol support which means the
logic will need to consider service impact too. The ‘check’ option provides
this additional layer to alert the user of service impact to <em>secondary</em>
daemons.</p>
<p>The list below shows some of these additional daemons.</p>
<ul class="simple">
<li><p>mgr (not included in ok-to-stop checks)</p></li>
<li><p>prometheus, grafana, alertmanager</p></li>
<li><p>rgw</p></li>
<li><p>haproxy</p></li>
<li><p>iscsi gateways</p></li>
<li><p>ganesha gateways</p></li>
</ul>
<p>By using the –check option first, the Admin can choose whether to proceed. This
workflow is obviously optional for the CLI user, but could be integrated into the
UI workflow to help less experienced Administators manage the cluster.</p>
<p>By adopting this two-phase approach, a UI based workflow would look something
like this.</p>
<ol class="arabic simple">
<li><p>User selects a host to place into maintenance</p>
<ul class="simple">
<li><p>orchestrator checks for data <strong>and</strong> service impact</p></li>
</ul>
</li>
<li><p>If potential impact is shown, the next steps depend on the impact type</p>
<ul class="simple">
<li><p><strong>data availability</strong> : maintenance is denied, informing the user of the issue</p></li>
<li><p><strong>service availability</strong> : user is provided a list of affected services and
asked to confirm</p></li>
</ul>
</li>
</ol>
</div>
</div>
<div class="section" id="components-impacted">
<h2>Components Impacted<a class="headerlink" href="#components-impacted" title="Permalink to this headline">¶</a></h2>
<p>Implementing this capability will require changes to the following;</p>
<ul class="simple">
<li><p>cephadm</p>
<ul>
<li><p>Add maintenance subcommand with the following ‘verbs’; enter, exit, check</p></li>
</ul>
</li>
<li><p>mgr/cephadm</p>
<ul>
<li><p>add methods to CephadmOrchestrator for enter/exit and check</p></li>
<li><p>data gathering would be skipped for hosts in a maintenance state</p></li>
</ul>
</li>
<li><p>mgr/orchestrator</p>
<ul>
<li><p>add CLI commands to OrchestratorCli which expose the enter/exit and check interaction</p></li>
</ul>
</li>
</ul>
</div>
<div class="section" id="ideas-for-future-work">
<h2>Ideas for Future Work<a class="headerlink" href="#ideas-for-future-work" title="Permalink to this headline">¶</a></h2>
<ol class="arabic simple">
<li><p>When a host is placed into maintenance, the time of the event could be persisted. This
would allow the orchestrator layer to establish a maintenance window for the task and
alert if the maintenance window has been exceeded.</p></li>
<li><p>The maintenance process could support plugins to allow other integration tasks to be
initiated as part of the transition to and from maintenance. This plugin capability could
support actions like;</p>
<ul class="simple">
<li><p>alert suppression to 3rd party monitoring framework(s)</p></li>
<li><p>service level reporting, to record outage windows</p></li>
</ul>
</li>
</ol>
</div>
</div>



           </div>
           
          </div>
          <footer>
    <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
        <a href="../compliance-check/" class="btn btn-neutral float-right" title="Compliance Check" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
        <a href="../developing-cephadm/" class="btn btn-neutral float-left" title="Developing with cephadm" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
    </div>

  <hr/>

  <div role="contentinfo">
    <p>
        &#169; Copyright 2016, Ceph authors and contributors. Licensed under Creative Commons Attribution Share Alike 3.0 (CC-BY-SA-3.0).

    </p>
  </div> 

</footer>
        </div>
      </div>

    </section>

  </div>
  

  <script type="text/javascript">
      jQuery(function () {
          SphinxRtdTheme.Navigation.enable(true);
      });
  </script>

  
  
    
   

</body>
</html>